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ABSTRACT 

The  objective  of  this  research  is  to  develop  a  nonlinear 
regulator  for  an  adaptive  control  system  using  backpropagating 
neural  networks  (BNN's)  in  conjunction  with  a  linear  quadratic 
regulator  (LQR) .  The  basic  concepts  of  adaptive  control  and 
the  structure  of  neural  networks  are  discussed.  These 
concepts  are  integrated  and  the  nonlinear  regulator  is 
derived.  Simulation  is  conducted  on  a  representative 
nonlinear  system  with  both  the  LQR  and  the  nonlinear 
regulator.  Training  of  the  regulator  and  its  performance 
under  varying  BNN  parameter  values  are  examined. 

The  simulation  results  show  that  the  nonlinear  regulator 
with  BNN's  exhibits  superior  performance  compared  to  the  LQR 
when  the  nonlinearities  are  large.  The  optimization  of 
regulator  performance  with  regard  to  BNN  parameter  values  is 
discussed. 

Further  research  is  required  in  order  to  determine  the 
general  applicability  of  this  regulator  and  to  develop  more 
specific  guidelines  for  BNN  parameters. 
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I .   INTRODUCTION 

A.  OBJECTIVE 

The  objective  of  this  thesis  is  to  develop  a  nonlinear 
regulator  for  an  adaptive  control  system  using  backpropagating 
neural  networks  (BNN's)  in  conjunction  with  a  linear  quadratic 
regulator  (LQR) .  Discrete  time  models  are  used  to  represent 
the  systems  for  simulation  and  analysis. 

B.  BASIC  CONCEPTS  OF  ADAPTIVE  CONTROL 

There  are  inherent  nonlinearities  in  most  control  systems 
due  to  elements  such  as  environmental  changes,  minor  system 
component  failures,  random  time-varying  parameters,  and  hard 
limits  caused  by  physical  constraints.  These  nonlinearities 
may  be  handled  by  an  optimal  control  design  if  it  is 
sufficiently  robust,  but  with  large  uncertainties  the 
controller  has  to  sacrifice  performance  in  favor  of  robustness 
and  this  might  be  unsatisfactory. 

One  method  of  handling  these  nonlinearities  is  to  use  an 
adaptive  control  system.  An  adaptive  control  system  is  one 
which  continuously  and  automatically  measures  the  dynamic 
characteristics  of  the  plant,  compares  them  with  the  desired 
attributes,  and  uses  the  difference  to  vary  adjustable  system 
parameters  so  that  optimal  performance  can  be  maintained 
regardless  of  the  nonlinearities  encountered  [Ref .  1:  p.  792] . 


1.  Performance  Indexes 

The  basis  of  adaptive  control  rests  on  the  premise  that 
there  is  some  performance  of  the  system  which  is  optimal. 
Optimal  performance  is  defined  by  specifying  a  performance 
index  to  measure  the  closeness  of  the  controlled  system  to  its 
goal.  There  is  an  infinite  number  of  possibilities  in  the 
choice  of  a  performance  index.  The  choice  of  a  particular 
index  depends  on  the  system  and  the  desired  results.  In  most 
cases  the  choice  of  index  involves  a  compromise  of  minimizing 
the  system  costs  while  maximizing  the  system  performance. 

The  major  drawback  of  performance  indexes  is  while  they 
specify  the  cost  of  system  operation  they  do  not  give 
information  about  the  transient  response  of  the  system  [Ref. 
1:  p. 793] .  A  system  that  operates  optimally  according  to  the 
performance  index  may  have  undesirable  transient 
characteristics.  The  transient  response  of  the  system  must  be 
analyzed  to  validate  the  choice  of  weighting  matrices  used  in 
the  index. 

2.  Adaptive  Control  Systems 

An  adaptive  control  system  may  include  the  following  three 
functions: 

•  Classification  of  the  dynamic  characteristics  of  the 
plant . 

•  Decision  making  based  on  the  classification  of  the  plant. 

•  Modification  based  upon  the  decisions  made. 


If  the  plant  parameters  are  not  exactly  correct,  then  the 
initial  classification,  decision,  and  modification  procedures 
will  be  insufficient  to  optimize  the  performance  index.  It  is 
then  necessary  to  continuously  carry  out  these  procedures 
throughout  the  period  of  operation.  This  constant  redesign  of 
the  system  identifies  an  adaptive  control  system  [Ref.  1: 
p. 793-794] . 

C.   NEURAL  NETWORKS  IN  ADAPTIVE  CONTROL 

While  control  theory  for  linear  time -invariant  (LTD 
systems  is  a  relatively  mature  field,  nonlinear  control 
systems  generally  must  be  designed  on  a  system- to- system 
basis.  Neural  networks,  with  their  inherent  adaptability, 
have  the  potential  for  wide  application  in  nonlinear  control 
systems.  The  ability  of  a  neural  network  to  be  trained 
suggests  that  a  neural  network  may  be  able  to  successfully 
control  nonlinear  systems  which  have  poor  performance  when 
regulated  by  linear  time- invariant  controllers.  This  thesis 
will  investigate  the  use  of  neural  networks  in  conjunction 
with  LTI  control  methods  to  construct  a  nonlinear  regulator 
[Ref.  2:  p. 410].  This  regulator  will  be  implemented  on  a 
system  that  may  be  modeled  as  LTI  with  an  added  nonlinearity 
which  either  makes  control  by  LTI  methods  inefficient  or 
unstable. 

Two  neural  networks  shall  be  used  in  the  formulation  of 
the  regulator:    one  in  parallel  with  the  estimated  plant 


parameters  to  generate  a  better  state  vector  estimate  and  one 
in  parallel  with  the  linear  control  to  produce  an  improved 
control  input  to  the  system.  Chapter  II  discusses  a  general 
neural  network  structure  and  the  derivation  of  the  nonlinear 
regulator.  The  performance  of  the  regulator  on  a  particular 
system  is  presented  in  Chapter  III  and  conclusions  are  given 
in  Chapter  IV. 


II.   BNN  NONLINEAR  ADAPTIVE  CONTROL 

A.   BASIC  CONCEPTS  OF  NEURAL  NETWORKS 

Artificial  neural  networks  are  made  up  of  elements  which 
operate  in  a  manner  analogous  to  the  most  simple  functions  of 
biological  neurons.  These  elements  are  linked  in  a  fashion 
which  may  be  similar  to  the  connections  within  the  human 
brain.  Whether  or  not  artificial  neural  networks  actually  are 
representative  of  the  construction  of  the  brain,  they  do  show 
characteristics  which  are  reminiscent  of  the  human  brain.  An 
artificial  neural  network  may  be  trained,  it  can  recognize 
patterns,  and  it  may  apply  its  learning  from  past  lessons  to 
new  data.  These  traits  are  quite  limited,  however  they  lend 
themselves  to  a  wide  field  of  applications. 

1.   The  Artificial  Neuron 

The  artificial  neuron  is  meant  to  mimic  the  first - 
order  characteristics  of  the  biological  neuron.  A  set  of 
inputs  is  applied,  each  input  representing  the  output  of 
another  neuron.  Each  input  is  multiplied  by  a  corresponding 
linkweight  and  all  of  the  weighted  inputs  are  summed.  This 
sum  is  the  activation  level  of  the  neuron.  Figure  1  shows  a 
detailed  representation  of  a  single  neuron. 

In  Figure  1  the  inputs  are  labeled  x,,  x2,...,xH. 
These  inputs  are  defined  collectively  as  the  {n   x  2)  column 
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Figure  1.   A  Neuron 
node  produces  a  scalar  output 

z(M   which  may  be  stated  in  vector  notation: 

xf-arf™.  (2-D 

The  output  of  the  summation  node,  z* ,  is  further 
processed  by  an  activation  function  f,  to  produce  the  neuron's 
output  signal  xfi 

x?  =  f(z?)  .  (2-2) 

There  are  numerous  activation  functions  such  as  a 
simple  threshold,  a  hard-limited  linear  function,  the 
hyperbolic  tangent,  and  the  sigmoid.  These  functions  are 
shown  in  Figure  2.  The  main  purpose  of  the  activation 
function  is  to  serve  as  a  nonlinear  gain  so  that  each  neuron 
maps  a  wide  range  of  inputs  into  a  bounded  output. 
2.   Single  Layer  Neural  Networks 

By  themselves,  neurons  have  limited  capabilities, 
unless  they  are  grouped  together  in  layered  networks.   Figure 
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Figure  2.   Various  Activation  Functions 

3  shows  a  single  layer  neural  network  with  its  inputs  on  the 
far  left  of  the  figure  and  the  outputs  on  the  right.  The 
single  layer  neural  network  only  performs  the  vector 
multiplication;  there  is  no  activation  function. 

The  circles  on  the  left  of  Figure  3  serve  only  as 
distribution  points  for  the  inputs;  they  do  not  perform  any 
calculations  so  they  are '  not  neurons.  Each  element  of  the 
input  vector  x  is  applied  to  each  neuron.  The  linkweights  may 
be  considered  as  a  matrix  a  with  m  rows  and  n  columns  where  m 
is  the  number  of  outputs  and  n  is  the  number  of  inputs.  The 
number  n  is  one  greater  than  the  dimension  of  the  input  vector 
x  due  to  the  bias  term.  Using  equation  (2-1) ,  z  is  the  (m  x 
1)    output  vector: 
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Figure  3 .   Single  Layer  Neural  Network 


z  =  ax. 


(2-3) 


The  non-zero  bias  term  is  added  as  an  input  to  each  neuron. 
This  bias  input  is  also  multiplied  by  its  own  linkweight. 
3.   Multiple  Layer  Neural  Networks 

Neural  networks  with  multiple  layers  offer  greater 
computational  capabilities  than  single  layer  networks.  These 
networks  are  formed  by  cascading  a  number  of  single  layers 
with  the  outputs  of  one  layer  providing  the  inputs  to  the  next 
layer.  These  middle  layers  are  known  as  hidden  layers. 
Figure  4  shows  a  multiple  layer  neural  network.  The  key  to 
providing  the  extra  power  of  the  multiple  layer  networks  is 


Layer  2 


Bias 


Layer  M 


Figure  4.   Multiple  Layered  Neural  Network 

that  the  activation  function  is  included  in  the  hidden  layers 
of  the  network.  Otherwise,  the  multiple  layer  network  could 
be  modeled  by  a  single  layer  network  which  had  a  linkweight 
matrix  equal  to  the  product  of  the  individual  linkweight 
matrices  [Ref.  3:  p. 19].  The  output  layer  usually  has  the 
function  f  (x)  =  x,  waiving  the  limits  of  the  activation 
function. 

4.   Training  of  Neural  Networks 

A  network  is  trained  so  that  a  set  of  inputs  will 
produce  a  desired  set  of  outputs.  Training  is  accomplished  by 
applying  an  input  vector,  computing  the  output  vector, 
comparing  it  to  the  desired  vector,  and  modifying  the 
linkweight-s  by  a  predetermined  algorithm.   As  the  network  is 


trained,  the  linkweights  will  converge  to  values  which  will 
produce  the  desired  output  vector. 

There  are  several  methods  of  training  a  neural 
network,  one  of  which  is  the  backpropagation  routine.  This  is 
the  algorithm  used  to  train  the  neural  networks  used  in  this 
thesis.   The  next  section  discusses  this  technique. 
5.   Backpropagating  Neural  Networks 

The  backpropagating  neural  network  is  structured  as 
shown  in  Figure  4 .  The  signal  flows  from  input  to  output .  We 
assume  no  connections  between  the  neurons  of  a  layer  and  no 
feedback  from  any  layer  to  the  previous  layers. 
Backpropagation  refers  to  the  method  used  to  adjust  the 
linkweights  throughout  the  neural  network. 

The  output  of  a  neuron  in  the  last  layer  (xf4)  is  used 
with  a  desired  output  (x,gt)  to  produce  an  error  signal  (ef1)  . 
This  error  signal  is  multiplied  by  the  first  derivative  of  the 
activation  function  for  that  neuron.   Mathematically, 

b&   =  f'U/)  ixf-x^)  (2_4) 

=  f'(xf)  (e/)  . 

Then  6  is  multiplied  by  the  output  from  the  source  neuron  for 
the  linkweight  which  is  to  be  updated  and  in  turn  is 
multiplied  by  a  scaling  factor  /x,  the  learning  rate.  This 
results  in 
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A?}1  =  iffx?*  (2-5) 

and 

3ij.  t*l    =  *ij.t-V  *ij     >  12-61 

where 

•  a^"2  is  the  linkweight  from  neuron  i  in  the  layer  Af-1  to 
neuron  j   in  the  output  layer  M, 

•  6^  is  the  value  of  6   for  the  linkweight  sl^4'1  , 

•  t+1   denotes  the  updated  linkweight. 

The  linkweights  in  the  hidden  layers  cannot  be  trained 
by  this  process,  since  there  is  no  available  target  vector. 
Backpropagation  trains  the  hidden  layer  linkweights  by 
propagating  the  output  error  back  through  the  network, 
adjusting  the  linkweights  for  each  layer.  Equations  (2-4)  , 
(2-5)  ,  and  (2-6)  are  still  used  for  the  hidden  layers,  but  the 
target  vector  must  be  generated  differently.  The  6  is 
calculated  for  each  neuron  in  the  output  layer,  as  in  Equation 
(2-4)  .  The  linkweights  feeding  the  output  layer  are  adjusted 
using  Equations  (2-5)  and  (2-6)  .  The  d^'1  is  propagated  back 
through  the  network  to  generate  a  value  for  <5  for  each  neuron 
in  the  hidden  layer.  These  values  are  used  to  adjust  the 
weights  of  the  preceding  hidden  layer,  all  the  way  back  to  the 
linkweights  that  act  upon  the  inputs. 

This  is  most  easily  shown  in  vector  notation.   The 
vector  of  6's  for  the  output  layer  is  defined  as  DM_,   and  the 
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set  of  linkweights  for  the  output  layer  as  the  matrix  AM,t.  To 
find  DM.2,  multiply  DM.j  by  the  transpose  of  the  matrix  AM_;. 
Then  multiply  each  component  of  this  vector  by  the  first 
derivative  of  the  activation  function  for  the  corresponding 
neuron  in  the  hidden  layer.  This  yields  the  vector  of  d's  (DM, 
2)    for  the  hidden  layer.  Mathematically, 

JVa-UVi*Ml  ■*[ir/(^-1)],  (2-7) 

where  .*  denotes  component -by- component  multiplication  of 
arrays  [Ref.  3:  pp. 51-53] . 

B.   INTEGRATION  OF  NEURAL  NETWORKS  AND  ADAPTIVE  CONTROL 

As  suggested  earlier,  the  versatility  of  neural  networks 
makes  them  prime  candidates  for  nonlinear  controllers.  The 
vast  majority  of  LTI  systems  are  able  to  be  controlled  by 
linear  quadratic  regulators  (LQRs) .  However,  the  LQR  itself 
may  not  exhibit  satisfactory  performance  in  the  presence  of 
large  uncertainties.  That  is,  inaccuracies  in  the  estimation 
of  system  parameters  or  nonlinearities  in  the  system  may  cause 
the  LQR  to  lose  its  optimality.  In  particular,  nonlinear 
system  dynamics  can  not  be  accounted  for  in  LQR  design. 

The  proposal  here  is  to  derive  a  nonlinear  adaptive 
regulator  which  uses  neural  networks  to  compensate  for  the 
nonlinear  dynamics.  This  regulator  will  include  two 
backpropagating  neural  networks  (BNN's) :  one  for  modeling  and 
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one  for  control.   The  modeling  BNN  is  used  to  find  a  more 
accurate  state  vector  estimate  than  that  found  using  the  given 
system  parameters.  The  control  BNN  modifies  the  control  input 
by  adding  nonlinear  terms  to  the  LQR. 
1.   Linear  Quadratic  Regulator 

The  LQR  is  a  state  feedback  controller  which  minimizes 
a  performance  index  known  as  the  cost  function.  Consider  the 
LTI  system: 

Xc+1  =  Axt  +  Buc,  (2-8) 

where  x,  -  (x^,  x^,,  ...  ,  xn4)T  is  the  n~ dimensional  state 
vector,  i^  »  {uu,  u2jt,  ...  ,  uma)T  is  the  m- dimensional  control 
vector,  and  A  and  B  are  the  system  parameters  of  dimensions  (n 
x  n)  and  (n  x  m) ,  respectively.  The  LQR  is  a  regulator  which 
minimizes  the  cost  function: 


B-   |E  <*?0*e  +  uj*uc)  (2"9) 


where  Q  is  an  (n  x  n)  symmetric  and  positive  semi -definite 
matrix  and  R  is  an  (in  x  m)  positive  definite  matrix.  The 
optimal  control  u,  is  defined  by 

ut  =  -Kxc   .  (2-10) 

where 
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K=    [BTPB  +  R]-XBTPA.  (2-11) 

K  is  the  (in  x  n)    optimal  feedback  gain  matrix  and  the  (n   x  n) 
matrix  P   is  the  solution  to  the  algebraic  Riccati  equation 
[Ref .  4:  p. 320] : 

P  =  A  TPA  +  0- A  TPB  [B  TPB+R]  ~XB  TPA .  (2  - 12 ) 

Obviously,  the  LQR  does  not  take  nonlinear  system 
dynamics  into  consideration,  but  it  does  give  a  good  baseline 
for  comparison  to  the  nonlinear  regulator  derived  in  the  next 
section. 

2.   Backpropagating  Neural  Network  Design 

The  system  with  the  nonlinear  regulator  is  shown  in 
Figure  5.  The  modeling  and  control  BNNs  are  labeled  BNNM  and 
BNNC,  respectively.  The  modeling  BNN  is  connected  in  parallel 
with  the  estimated  plant  parameters,  labeled  Ae  and  Be.  The 
control  BNN  is  connected  in  parallel  with  the  LQR,  labeled  K. 
The  block  labeled  D   denotes  a  unit  time  delay. 

The  nonlinear  plant  will  be  modeled  as  a  linear  time- 
invariant  system  with  a  nonlinearity  added: 

xc+x  =  Axe  +  But  +  vc  (2-13) 

where  vt   is  the  n- dimensional  nonlinear  function.   First,  a 
linear  estimate  of  the  state  equation  is  made: 
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Figure  5.   Regulator  Block  Diagram 

xt+1  =  Aext  +  Baut.  (2-14) 

The  estimates  of  the  system  parameters  may  be  computed  by  any 
number  of  methods.  This  linear  estimate  will  deviate  from  the 
actual  system  output  due  to  the  nonlinear  system  dynamics  and 
the  inaccuracies  in  parameter  estimation. 

In  order  to  compute  a  more  accurate  estimate  of  xt+JI 
the  modeling  BNN  is  used  to  produce  an  adjustment  to  x,+I.     The 
modeling  BNN  has  the  {m   +  n)  -dimensional  input  vector  (u,,  x±) 
and  the  n- dimensional  output  vector  5^+/.  Therefore,  the  new 
estimate  of  xt+l   is: 


15 


*e*i  =  *t+i+**C+i  (2-15) 

where 

**t*i  ~   ?(«(«'()  •  (2-16) 

The  modeling  BNN  is  trained  so  that  the  error  between  x,+/  and 
xt+1   is  minimized. 

The  optimal  feedback  gain  matrix  K  is  computed  from  Ae 
and  Be  found  in  equation  (2-14)  and  user-defined  weighting 
matrices  Q   and  R.      The  linear  control  input  is  computed: 

Uc  =  -Kxt.  (2-17) 

As  discussed  earlier,  this  controller  is  no  longer  optimal  due 
to  the  nonlinear  system  dynamics  and  the  parameter 
inaccuracies.  The  control  BNN  is  used  in  order  to  optimize 
the  control  input.  The  control  BNN  has  the  n-dimensional 
input  vector  (jq)  and  the  m- dimensional  output  vector  5i^.  The 
control  input  u^  is  generated  by 

ut  =  ut  +  &ut  (2-18) 

where 

8uc  =  h(xt)  .  (2-19) 

The  control  BNN  is  trained  so  that  the  control  input  u, 
minimizes  the  cost  function  of  equation  (2-9). 
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3 .   BNN  Training  Algorithms 

The  following  notation  is  used  for  both  the  modeling 
and  the  control  BNN: 

•  n   refers  to  a  particular  layer  of  the  network; 

•  the  ith  node  in  the  nth  layer  is  designated  node(n,    i) ; 

•  M  is  the  total  number  of  layers  in  the  network,  including 
the  input  and  output  layers; 

•  Nn   is  the  number  of  nodes  in  the  nth  layer; 

•  xTu   is  the  output  of  the  node(n,    i)   at  time  t; 

•  an(j   is  the  linkweight  from  the  node(n,   j)    to  the  node(n   + 
1,  ±)  . 


The  functional  representation  of  the  node(n,    i)    is  found  by: 

z".  =  V  a?~}x?~l  (2-20) 

J'-l 

or 

zn   =   gfl-l^n-l  (2-21) 

where  aTl   is  the  (Nn   x  Nn.j)  -dimensional  linkweight  matrix  and 

*i.t   =  f^i.t)  (2"22) 

where  x"/r  is  the  ith  element  of  the  vector  input  to  layer  n. 
The  function  f  (• )    is  chosen  to  be  the  sigmoid  function 
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fix)    = —  (2-23) 

for  the  hidden  layers  and 

f(x)    =  x  (2-24) 

for  the  output  layer.   In  accord  with  this  notation,  the 
dimensions  Nj   =  m  +  n   and  NM   =  n   in  the  modeling  BNN,  and  Nt   = 
n   and  NM   =  m   in  the  control  BNN  where  n  and  m   are  the  number 
of  columns  in  the  system  matrices  A  and  B,    respectively. 
a.  Modeling  BNN 

In  deriving  the  modeling  BNN,  the  first  step  is  to 
determine  the  error  function  which  will  be  used  for  training. 
Since  the  goal  is  to  produce  an  accurate  estimate  of  Xj+1,  the 
error  function  is  chosen  to  minimize  the  difference  between 
x,+1   and  jq+/: 

B&i   =  \  (£c+1-xt+1)r(.*c>1-xctl)  .  (2-25) 

The  linkweights  are  updated  using  a  version  of  the  gradient 
descent  algorithm: 

«uw  a  •t.j.t  -  v>m^t  (2"26) 
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where  fiM  is  a  step  size  parameter  which  controls  the  learning 
rate  of  the  linkweights.  The  partial  derivative  of  the  cost 
function  is  computed  as  follows: 

ggeg  .  d      ,  1  {±  )T{±  }] 

-  a  ~    n     l  o   v"*e*i  *t*i'  v"*t*i  *t+\'  J 

=  (S^-X^)3"  -A"  (*e*-*e*>  (2-27) 


M 

=  (x      -x     ) 


r  ^e*i 


where 


aai°. 


dai#J   dai#i 


Since  x,+1   is  independent  of  the  linkweights 


-  J. 


g(Uc,Xc)  • 


(2-29) 


The  function  g   is  the  mapping  function  for  the  layers  of  the 

network  described  earlier. 

Therefore, 

^£1  =  (Jrc+1-xc+1)r  — t-g{ut,xc)   .        (2-30) 
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The  computation  of  {dg/da)    is  found  next  [Ref .  2 
p. 412]  : 

d 


where 


gr(uc,jcc)  =  tiiitxjit  (2-31) 


N„ 


Alt -*'<*#>  E*SAK-  <2"32> 


The  function  f'(x)  denotes  the  first  derivative  of  f  (x)  at  x. 
The  initial  condition  on  A  is 

mx  i  (2-33) 

Af.J  =  [0,.„,0,l,0,...,0]p. 

Therefore,  the  linkweight  training  is  accomplished  as: 

ajli.t*  =  «&.*  "  |*w(it+1-xt+1)rA?,tx/t      (2-34) 

1).   Control  BNN 

Finding  the  error  function  for  the  control  BNN  is 
more  involved  than  for  the  modeling  BNN.  If  the  plant  was  an 
LTI  system,  the  control  input  i^  which  minimizes 

Ecc  =  1   (x£rtlPxm+uJi?ut)  (2-35) 

is  given  by  equations  (2-10)  through  (2-12)  [Ref.  2:  p. 413]. 
This  implies  that  the  LQR  generates  a  control  'input  which 
minimizes  equation  (2-35)  for  each  time  t.  This  control  input 
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is  suboptimal  even  if  the  system  is  LTI .  The  nonlinearities 
in  the  system  make  the  LQR  even  less  optimal.  The  control  BNN 
is  connected  in  parallel  with  the  LQR  to  adjust  for  control 
errors  caused  by  parameter  inaccuracies  and  nonlinear  system 
dynamics.  Therefore  an  error  function  must  be  chosen  which 
makes  the  output  of  the  control  BNN  equal  zero  if  the  plant 
parameters  are  exactly  expressed  by  Ae  and  Be  and  the 
nonlinearities  are  set  equal  to  zero.  Equation  (2-35)  is  a 
good  choice  for  the  error  function  at  time  t,  but  xt+1  is  not 
available  at  time  t.  It  is  replaced  by  xt+1  which  is  found 
with  the  modeling  BNN.  The  error  function  for  the  control  BNN 
is  now 

Ecc  =  1  (±^P±t^^uTtRue)  .  (2-36) 

The  linkweight  for  the  control  BNN  is  defined  as 
If,  j  in  order  to  differentiate  it  from  the  modeling  BNN 
linkweight,  an,j.  The  control  linkweights  are  updated  by  the 
gradient  descent  algorithm: 

dEc 
hli.t+i  *b?tj.t  -   nc-^i  (2-37) 

where  \ic   is  the  learning  rate  parameter. 
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Equation  (2-19)  defines  the  nonlinear  function  for 
the  control  BNN.  This  is  used  to  calculate  the  training 
algorithm  as  follows: 


dbitJ       dbitj     * 

dbftJ  db*tj 


(2-38) 


First,    Ox,+i/3Jb)   will  be  derived  followed  by  the  derivation  of 
(diif/db)  .      Substituting  equation    (2-15)    into    {dxt+1/db)    gives: 

dSt  a 

TT~   m-£T  <**  +  ***«)'  .  (2-39) 

Next,    equations    (2-14)    and    (2-18)    are  used: 
d3fc+i   _      d 


a 


uext+£eut+a*t+1) 

Uexe+Be(uc  +  5ut)  +  5xc+1) 


(2-40) 


db? 
but  Xj   and  Uj   are  independent  of  Jb,  therefore 

dx  a 

1^  '  lEf/BM^ix^  ■  <2"41> 

Substituting  in  equations  (2-19)  and  (2-16)  results  in 

dxc+1     dh{xt)       dg{ut.,xc) 

IT  =  Se ;T~+  — a  •  '  (2-42) 

dbi  t  j  db±  t  j  db±  t  j 
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The  second  partial  derivative  in  equation  (2-42) 
is  the  partial  derivative  of  a  function  of  u,  which  is  itself 
dependent  upon  Jb,  therefore  the  derivative  must  be  split  as 
follows: 

3±t»i  -  B   dh<xt)  +  dg(ut,xe)     due  (2-43) 

Bbt.j         e   dbffj  dut         Bbty 

But  the  partial  derivative  of  i^  may  be  further  simplified  by: 

But        d(ut  +  &ut) 


dbitj  db*j 

8(6ue) 
dbf.j 
dh(xe) 

Substituting  equation  (2-44)  into  equation  (2-43)  : 

d±c^   _b  dh(xc)  ^dg{ut,xt)    dh(xt) 
dbt.j  '    *    dbf.j  <*"c  db?tJ 

(B  ¥*S!*t^)tol*J.. 


(2-44) 


(2-45) 


Using  equation  (2-31)  ,  (dh(x,)/db)  may  be  computed  in  the  same 
manner  as  (dg-Ci^x^/da)  .  The  (n  x  m)  matrix  {dgdi,,^) /dxij)  is 
found  by  [Ref.  2:  p. 413] : 

Bgiu     xt)    M«g    fr  (2.46) 
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The  control  linkweight  training  algorithm  may  be 
summarized  as: 


bi,j,t+l  =  bi.j,  t~V>( 


af-iP 


B, 


dg(ut,xt)  ) 


+  uTtR 


li.tXf.t.      (2-47) 


The  training  algorithm  for  both  BNNs  may  be 
summarized  as  follows: 

•  observe  the  state  vector  x;  at  time  t, 

•  update  the  linkweight  matrices  a,    using  equation  (2-34) , 

•  compute  the  control  input  i^  with  equations  (2-17) ,  (2-18) 
and  (2-19) , 

•  calculate  the  state  vector  estimate  j^+/  using  equations 
(2-14),  (2-15) ,  and  (2-16), 

•  update  the  linkweight  matrices  Jb,  using  equation  (2-47) . 

This  is  repeated  at  each  successive  time  t,  as  the  new  state 
vector  becomes  available. 
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III.   SIMULATION  RESULTS  AND  DISCUSSION 

A.   NONLINEAR  SYSTEM  DESIGN 

In  order  to  test  the  LQR  and  the  nonlinear  regulator,  a 
nonlinear  system  was  required  to  be  modeled.  This  system  was 
modeled  with  a  linear  part  which  was  then  acted  upon  by  a 
nonlinear  function.  The  linear  portion  was  artificially 
estimated  to  provide  a  more  realistic  case  study.  The  linear 
system  used  as  a  baseline  was  first  presented  as  a  transfer 
function  in  continuous  time,  transformed  to  a  state  space 
representation,  and  then  converted  to  discrete  time. 

The  chosen  transfer  function  was  unstable  with  poles  at 
+2.00  and  -1.00,  a  zero  at  +0.10,  and  a  gain  of  10: 


Y(s)    _  lO(s-O.l) 
U(s)    "    (sr  +  1)  (s-2) 


(3-1) 


The  equivalent  of  this  transfer  function  in  state  space  format 
is  [Ref .  1:  pp. 675-677] : 


= 

0    1" 
2    1. 

y + 

'10 
9 

Acy 

>Bcu 

a 

u 


(3-2) 
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This  equation  is  converted  to  discrete  time  using  a  truncated 
infinite  series  to  calculate  the  state  transition  matrix  and 
input  matrix  [Ref.  4:  pp. 123-127]  with  a  sample  time  of  0.1 
seconds.   The  discrete  time  equation  is: 


yt*i  = 


1.0104  0.1055 
0.2110  1.1159 
=  Ayt^But. 


yt  + 


1.0500 
1.0533 


u, 


(3-3) 


Estimated  values  (Ae  and  Be)  were  found  for  the  purpose  of 
illustrating  the  relative  robustness  of  the  LQR  and  the 
nonlinear  regulator.  These  values  were  found  by  taking  the 
members  of  A  and  B  and  calculating  values  which  differed  from 
the  actual  numbers  by  ±10-20  percent. 

The  nonlinear  part  of  the  system  equation  was  modeled  as 
follows: 


*i,t+i  =  yi,t+i+€(1-exP(y2.t*i)) 

*2,t*i  =  y2,t+1+€(l-exp(ylftn)) 


(3-4) 


The  value  e  is  a  control  on  the  level  of  nonlinearity. 

B.   LQR  SIMULATION  RESULTS 

The  LQR  is  calculated  using  Ae    and  Be.        The  weighting 
matrices  Q   and  R   are  set  at: 


Q  = 


0  0 
0  1 


and    R   =  1 


(3-5) 
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The  solutions  to  equations  (2-11)  and  (2-12)  are: 


K  =  [0.1775    0.5057]     and    P  = 


0.0771    0.1060 
0.1060    1.3859 


(3-6) 


Figures  6  and  7  illustrate  the  performance  of  the  LQR  to 
a  unit  impulse.  Figure  6  shows  the  system  response  for  e  = 
0.05.  The  state  variables  quickly  converge  to  zero.  Figure 
7  shows  the  response  to  the  same  input  for  e  =  0.20.  It  can 
be  seen  that  the  LQR  is  not  sufficiently  robust  to  cause  the 
state  variables  to  converge  to  zero.  The  plant  converges  to 
a  nonzero  equilibrium  value. 

C.   NONLINEAR  REGULATOR  SIMULATION  RESULTS 

Both  BNN's  used  here  are  three- layered  neural  networks 
with  30  nodes  in  the  hidden  layer.  In  the  modeling  BNN,  N7  = 
in  +  n  =  3,  #2  =30,  and  N3  »  n  ■  2.  In  the  control  BNN,  Nt  = 
n  m  2,  N2  =  30,  and  N3  =  m  =  1.  The  learning  rate  for  both 
BNN's  was  set  at"  0.05.  The  initial  condition  on  the 
linkweight  matrices  was  to  fill  them  with  small  random  numbers 
(normal  distribution,  standard  deviation  =  0.1). 

Figure  8  plots  simulation  results  for  the  nonlinear  plant 
with  e  =  0.05.  The  results  are  quite  similar  to  those  of 
Figure  6.  For  this  level  of  nonlinearity,  the  nonlinear 
regulator  causes  the  state  variables  to  converge  to  zero 
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Figure  6.   LQR  Result  for  e  -  0.05 
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Figure  7.   LQR  Result  for  e  =  0.20 
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faster  than  the  LQR,  but  the  response  is  slightly  more 
oscillatory. 
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Figure  8.   Nonlinear  Regulator  Result  for  €  =  0.05 

Figure  9  shows  simulation  results  for  the  nonlinear  plant 
with  e  =  0.20.  Comparing  these  results  to  those  of  Figure  7 
shows  that  the  nonlinear  regulator  reduces  the  state  variables 
to  zero  in  a  case  where  the  LQR  is  unable  to  do  so.  While  the 
LQR  stabilizes  the  system  for  relatively  large  factors  of  e, 
it  is  of  limited  utility  once  the  state  variables  no  longer 
converge  to  zero.  The  state  variables  converge  to  zero  using 
the  LQR  up  to  €  =  0.18  while  the  nonlinear  regulator  continues 
to  drive  the  state  variables  to  zero  up  to  e  =  0.21.  The 
performance  of  the  nonlinear  regulator  is  better  than  the  LQR 
for  e  a  0.06. 
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Figure  9.   Nonlinear  Regulator  Result  for  €  =  0.20 

1.   BNN  Parameter  Variation 

The  next  area  of  investigation  was  to  vary  some  of  the 
parameters  of  the  neural  network  to  examine  the  effects  upon 
the  regulator.  This  was  accomplished  by  holding  all 
parameters  constant  with  the  exception  of  the  one  in  question. 
The  performance  index  W  is  found  by  summing  the  absolute 
values  of  the  state  variables  (x]t  and  x2t)  for  the  specified 
time  range.  The  performance  index  W  was  calculated  for  each 
instance,  again  for  a  unit  impulse  over  a  range  of  20  seconds. 

One  of  the  major  variances  in  the  performance  of 
neural  networks  is  caused  by  the  number  of  nodes  in  the  hidden 
layer  of  the  network.   This  number  (N2)    was  varied  from  10  to 
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40.  The  results  are  plotted  in  Figure  10.  The  straight  line 
(without  point  markers)  shows  the  performance  of  the  LQR  with 
no  neural  network.  Figure  10  shows  that  adding  more  nodes  to 
the  hidden  layer  improves  the  performance  of  the  regulator. 
However,  when  6  is  small,  the  LQR  yields  better  performance 
than  the  nonlinear  regulator  with  any  number  of  nodes.  As  € 
increases  above  0.15,  the  performance  degrades  and  the  number 
of  nodes  has  less  of  an  effect.  There  are  limits  upon  the 
number  of  nodes  used.  When  the  regulator  was  tried  with  50 
nodes  in  the  hidden  layer,  the  system  became  unstable. 
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Figure  10.   Variation  in  W  due  to  Number  of  Nodes 

The  learning  rate  was  also  varied*  to  determine  its 
effect  upon  regulator  performance.   This  step  size  parameter 
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had  an  effect  on  W  similar  to  that  of  varying  N2.  Both  the 
modeling  and  the  control  BNN's  had  their  learning  rate  changed 
to  the  same  value.  Figure  11  shows  the  effect  of  this  upon  W. 
An  interesting  feature  of  Figure  11  is  that  while  increasing 
the  learning  rate  caused  better  performance,  this  was  only- 
true  up  to  /i  -  0.05.  When  fi  was  increased  beyond  this, 
performance  was  degraded.  Increasing  fi  to  0.10  caused  the 
system  to  become  unstable. 
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Figure  11.   Variation  in  W  due  to  Learning  Rate 

2.   BNN  Training 

One  of  the  advantages  of  this  regulator  design  is  that 
there  is  very  little  training  required  in  order  to  achieve 
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convergence  of  the  linkweight  values.  All  simulation  mans 
were  completed  with  a  single  unit  impulse  training  run.  One 
of  the  major  drawbacks  to  the  majority  of  neural  network 
applications  is  the  amount  of  training  inputs  required  for 
convergence  [Ref.  3:  p. 88].  The  apparent  reason  for  this 
rapid  convergence  is  that  the  BNN's  are  not  providing  all  of 
the  state  vector  estimate  and  control  input:  they  are  only 
adding  a  correction  to  the  linear  state  vector  estimate  and 
the  LQR  control  input. 
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IV.   CONCLUSIONS 

A.  SUMMARY 

Starting  with  the  rationale  for  developing  an  adaptive 
regulator  for  nonlinear  systems,  the  development  of  a 
nonlinear  regulator  which  used  backpropagating  neural  networks 
in  conjunction  with  a  linear  quadratic  regulator  in  an 
adaptive  control  system  was  proposed.  The  regulator  was 
derived  and  applied  to  control  a  representative  nonlinear 
system. 

B.  SIGNIFICANT  RESULTS 

Simulations  of  the  BNN-based  nonlinear  regulator  were 
conducted  on  a  representative  nonlinear  system.  The  main 
observations  from  these  simulations  are  summarized  below: 


The  results  show  that  the  nonlinear  regulator  works  well 
in  the  control  of  a  nonlinear  system.  It  was  also  seen 
that  using  a  LQR  on  the  system  works  if  the  nonlinearity 
is  small. 

There  are  many  variables  involved  in  the  regulator  design 
which  must  be  optimized  by  trial  and  error.  Definitive 
rules  to  govern  the  selection  of  these  variables  would 
significantly  _  decrease  the  time  to  arrive  at  an 
appropriate  controller. 

The  amount  of  training  required  for  the  regulator  is 
minimal.   One  pulse  is  sufficient  to  train  the  network. 

This  regulator  is  unproven  for  all  nonlinear  systems.  Its 
utility  may  be  limited  to  those  which  may  be  modeled  as  a 
linear  system  with  an  added  nonlinear  component. 
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C.   FURTHER  RESEARCH 

The  emphasis  in  this  thesis  was  to  develop  a  BNN-based 
nonlinear  regulator  that  would  use  the  a  priori  knowledge  of 
system  parameters  to  find  a  controller  that  could  function 
more  efficiently  than  one  which  did  not  utilize  this 
knowledge.  The  neural  networks  used  in  this  research  only 
contained  one  hidden  layer.  While  this  was  sufficient,  the 
rules  used  in  deriving  the  BNN's  for  the  regulator  could  be 
used  to  design  neural  networks  with  numerous  hidden  layers. 
Guidelines  should  be  developed  which  would  govern  the  choice 
of  the  number  of  layers  used  in  the  neural  networks,  the 
number  of  nodes  necessary  in  the  hidden  layers,  the  learning 
parameters,  and  the  weighting  matrices  for  the  error 
functions . 
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APPENDIX  A:   NONLINEAR  REGULATOR  SIMULATION 

A.   SYSTEM  DERIVATION 

A  nonlinear  regulator  using  backpropagating  neural 
networks  in  conjunction  with  a  linear  quadratic  regulator  was 
designed  in  Section  B  of  Chapter  II.  This  appendix  details 
the  procedure  used  to  arrive  at  the  simulation  results  of 
Chapter  II  and  includes  the  software  written  for  the  research. 

The  representative  nonlinear  system  chosen  for  examination 
has  the  continuous  domain  transfer  function: 

Y(s)    m     IQ(s-O.l)     lOs-1  ..    -. 

U(s)  (s  +  1)  (s-2)  "  s2-s-2' 

• 

This  equation  must  be  transformed  to  state  space  format.  This 
is  accomplished  using  the  derivation  shown  below  [Ref.  1: 
p. 675-677]  . 

Yis)    .  b0sn+b1sa-1+~+bn_1s+bn  (A2) 


U(s)  sn  +  axsn-1+~  +  an_1s  +  an 

This  may  be  rearranged  and  transformed  to  an  nth- order  system 
of  linear  differential  equations: 

y<fl>+a1y(a-1)+.»  +  aay  =  2?0u(n)+^1u(n-1)+-+2?nu     (A- 3) 
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where  yn;  and  u(n)  are  the  nth  order  derivatives  of  y  and  u.     The 
state  variable  is  chosen: 

xx   =  y 

*>   =  *3  (A-4) 

xn  =  'aax1-aa.1x2 a1xa+b0u{n)  +Jb1u(a~1)  +«-+jbnu. 


However,  xi  ■  y  may  not  yield  a  unique  solution  due  to  the 
derivatives  of  the  forcing  function.   The  state  variable  are 

0 

redefined  as 

*a  =  y-Pou 

x2  =  y-Poii-PiU  =  xx-^xu 

xz  =  y-p0u-pxu-p2u  =  x2-p2u  (A-5) 

where  j30,    0lt  . . .  ,0n  are  determined  from 

Po   =*>o 

Pi   =^i-a1P0 

P2  =  Jb2-a1P1-a2p0  (A-6) 

Pn  =  ^n-<aiPn-l--«an-lPl-anP0. 

With  this  choice  of  state  variables,  the  following  state  and 
output  equations  are  found: 
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where 


x  =  Acx+Bcu 
y  =  Ccx+Dcu 


(A- 7) 


X  = 


"n-1 


A„  = 


0  0 

-a„   -a 


an-2 


0 
0 

1 

-a, 


*c  = 


Pi 
Pa 

Pa-i 
P* 


Cc  =  [1    0    •••   0],       Dc  =  p0  =  jbfl 


(A- 8) 
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B.       SIMULATION   SOFTWARE 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

% 

%  LQR.M 

% 

%   This  program  computes  a  linear  quadratic  regulator  for  the 

%   stated  system  and  simulates  the  response  to  a  unit 

%   impulse.   No  neural  network  adjustment  is  provided. 

%   This  is  a  stand-alone  program. 

%   LT  Kurt  Menke,  10  June  1992 

% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

% 

% 

clear;  clg 

% 

% 

%   The  first  part  of  this  program  converts  the  system 

%   from  continuous  time  transfer  function  to  discrete 

%   domain  state  space  equations. 

% 

% 

%   The  numbers  for  num  and  den  are  for  the 

%   representative  system  of  Chapter  III.   This  is  set  up 

%   for  a  second  order  system  with  a  single  input. 

%   It  may  be  adapted  for  any  order  system  using  the  equations 

%   in  Appendix  A. 

num  =  [0  10  -1] ; 

den  =  [1  -1  -2] ; 

% 

% 

%   This  transformation  to  state  space  follows  the 

%   format  of  Appendix  A. 

al  =  den (1,2) 

a2  =  den  (1,3) 

bO  =  num(l,l) 

bl  =  num  (1,2) 

b2  =  num  (1,3) 

betaO  =  bO; 

betal  =  bl  -  al*beta0; 

beta2  =  b2  -al*betal  -a2*beta0; 

% 

% 
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%   Continuous  state  space  equations 
Ac  -  [  0    1; 

-a2  -al]  ; 
Be  =  [betal;  beta2] ; 
Cc  =  [1   0]  ; 
Dc  =  betaO; 
% 

%disp (' Sample  time  for  continuous') 
dt  -  input ( ' to  discrete  conversion?   ' ) ; 
% 
% 
%   Discrete  state  space  equations 

[A,    B]    -   c2d(Ac,Bc,dt) ; 

% 

% 

[na,ma]    ■   size (A); 

[nb,mb]    =   size(B) ; 

rand ( ' uniform' ) 

rand ('seed' , 0) 

% 

% 

%   Ar,  Br,  As,  and  Bs  are  used  to  compute  the  "estimate''  of 

%   A  and  B  used  in  the  computation  of  the  LQR. 

%   This  ensures  that  Ae  and  Be  are  within  +/-  10-20% 

%   of  A  and  B 

Ar  »  0.1  .*  rand (na, ma)  +  0.1; 

Br  =  0.1  .*  rand(nb,mb)  +  0.1; 

% 

% 

%   As  and  Bs  are  arbitrary  signs  which  make  the  particular 

%   value  of  Ar/Br  either  +10-20%  or  -10-20% 

As  =  [-1  1;  1  -1] ; 

Bs  »  [-1;  1] ; 

Ar  =  As  .*  Ar; 

Br  =  Bs  .*  Br; 

% 

% 

%   Ae  and  Be  are  the  estimates  of  A  and  B 

Ae  =  (Ar  .*  A)  +  A; 

Be  =  (Br  .*  B)  +  B; 

% 

% 

%   Weighting  matrices 

Q  =  [0  0;  0  1]  ; 

R  =  1; 

% 

% 

%   Computes  the  LQR  gain  matrix  (K)  and  the  solution  to  the 

%   algebraic  Riccati  equation  (P) . 

[K,P]  =  dlqr(Ae,Be,Q,R) ; 
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% 
% 

%   e  is  the  coefficient  which  controls  the  amount  of 

%   nonlinearity  seen  by  the  system. 

e  =  input ( 'Nonlinearity  coefficient?   '); 

% 

%%  Recommended  run  time:   10  or  20  seconds. 

time  =  input (' Amount  of  system  run  time?   '); 

% 

% 

%   Initial  values 

Nt    =  time/dt  +  1; 

xhat  -  zeros (na,Nt) 

y    =  zeros (na,Nt) 

x    -  zeros (na,Nt) . 

u    ■  [1  zeros (l,Nt-l) ] ;   %  Impulse  input  at  time  0 

% 

% 

for  t  =  2:Nt 

y(:,t)     -  A*x(:,t-1)  +B*u(:,t-1); 

% 

%  State  vector 

x(l,t)     -  y(l,t)  +  e*(l-exp(y(2,t))) ; 

x(2,t)     «  y(2,t)  +  e*(l-exp(y(l,t))) ; 

% 

%  Control  input 

u(:,t)     -  -K*x(:,t) ; 

% 

%  State  vector  estimate 

xhat(:,t+l)  =  Ae*x(:,t)  +  Be*u(:,t); 
end; 
% 
% 

%   Performance  calculation 
Wl  =  0; 
W2  =  0; 
for  t  =  l:Nt 

Wl  -  Wl  +  abs(x(l,t) ) ; 

W2  =  W2  +  abs(x(2,t) ) ; 
end; 

W  =  Wl  +  W2; 
% 
% 

%   Plot  of  state  vector  response  to  unit  impulse 
t=0:dt:time; 

plot (t,x(l, :) ,tfx(2, :) ) ,grid; 
text (0.8, 0.8,num2str (W) ,'sc'); 
title ( 'Linear  Quadratic  Regulator'); 
xlabel ( 'Time  (sec)'); 
ylabel ( 'System  Output' ) ; 
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%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% 

%  NLREG.M 

% 

%   This  program  computes  a  nonlinear  regulator  for  the  stated 

%   system  and  simulates  the  response  to  a  unit  impulse.  The 

%   regulator  uses  two  neural  networks:   BNNC  and  BNNM.   This 

%   is  a  stand-alone  program. 

%   LT  Kurt  Menke,  10  June  1992 

% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 

% 

% 

clear;  clg; 

% 

% 

%   The  first  part  of  this  program  converts  the  system 

%   from  continuous  time  transfer  function  to  discrete 

%   domain  state  space  equations . 

% 

% 

%   The  numbers  for  num  and  den  are  for  the 

%   representative  system  of  Chapter  III.   This  is  set  up 

%   for  a  second  order  system  with  a  single  input. 

%   It  may  be  adapted  for  any  order  system  using  the  equations 

%   in  Appendix  A. 

num  -  10  *  [0  1  -.1]  ; 

den  =  conv( [1  -2] ,  [1  1]  )  ; 

% 

% 

%   This  transformation  to  state  space  follows  the 

%   format  of  Appendix  A. 

al  =  den (1,2) 

a2  =  den (1,3) 

bO  =  num (1,1) 

bl  =  num (1,2) 

b2  =  num (1,3) 

betaO  =  bO; 

betal  =  bl  -  al*beta0; 

beta2  -  b2  -al*betal  -a2*beta0; 


%   Continuous  state  space  equations 
Ac  =  [  0    1; 

-a2  -al] ; 
Be  =  [betal;  beta2] ; 
Cc  -  [1  0]  ; 
Dc  =  betaO; 
% 
% 
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disp( 'Sample  time  for  continuous') 

dt  =  input (' to  discrete  conversion?   '); 

% 

% 

%   Discrete  state  space  equations 

[A,    B]    =    c2d(Ac,Bc,dt)  ; 

% 

% 

[na,ma]    ■  size (A)  ; 

[nb,mb]    =   size(B) ; 

rand ( ' uniform' ) 

rand( 'seed' , 0) 

% 

% 

%   Ar,  Br,  As,  and  Bs  are  used  to  compute  the  "estimate"  of 

%   A  and  B  used  in  the  computation  of  the  LQR. 

%   This  ensures  that  Ae  and  Be  are  within  +/-  10-20% 

%   of  A  and  B 

Ar  =  0.1  .*  rand  (na,  ma)  +  0.1; 

Br  ■  0.1  .*  rand(nb,mb)  +  0.1; 

% 

% 

%   As  and  Bs  are  arbitrary  signs  which  make  the  particular 

%   value  of  Ar/Br  either  +10-20%  or  -10-20% 

As  -  [-1  1;1  -1] ; 

Bs  -  C-l;l] ; 

Ar  -  As  .  *  Ar ; 

Br  =  Bs  .*  Br; 

% 

% 

%   Ae  and  Be  are  the  estimates  of  A  and  B 

Ae  -  (Ar  .*  A)  +  A; 

Be  =  (Br  .*  B)  +  B; 

% 

% 

%   Weighting  matrices 

Q  =  [0  0;  0  1]  ; 

R  =  1; 

% 

% 

%   Computes  the  LQR  gain  matrix  (K)  and  the  solution  to-  the 

%   algebraic  Riccati  equation  (P) . 

[K,P]  =  dlqr(Ae,Be,Q,R) ; 

% 

% 

%   Learning  rate  parameters.   LearnM  is  for  BNNM  and  LearnC 

%    is  for  BNNC. 

LearnM  =  0.05; 

LearnC  =  0.05; 

% 

% 
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%   e  is  the  coefficient  which  controls  the  amount  of 

%   nonlinearity  seen  by  the  system. 

e  =  input ( 'Nonlinearity  coefficient?   '); 

% 

% 

%   Recommended  run  time:   10  or  20  seconds. 

time  ■  input (' Amount  of  system  run  time?   '); 

% 

% 

dispC  Number  of  nodes  in  hidden  layer') 

Nm  -  input ('of  modeling  net  (BNNM) ?   '); 

% 

% 

disp ( ' Number  of  nodes  in  hidden  layer') 

Nc  =  input ('of  control  net  (BNNC) ?   '); 

% 

% 

%   Initialization  of  the  linkweight  matrices 

[al,a2]  =  netinitm(na,Nm,na+2) ; 

[bl,b2]  -  netinitm(mb,Nc,nb+l) ; 

% 

% 

%   Bias  value  for  neural  networks 

Bias  »  1; 

% 

% 

%   Initial  values 

Nt  =  time/dt  +  1; 

x    -  zeros (na,Nt) ; 

y    =  zeros (na,Nt) ; 

xbar  -  zeros (na,Nt+l) 

xhat  =  zeros (na,Nt+l) 

xdel  -  zeros (na,Nt+l) 

ex   =  zeros (na,Nt) ; 

u    =  [1  zeros (mb,Nt-l) ] ;   %  Impulse  input  at  time  0 

% 

% 

%   Training  run 

for  t  =  2:Nt 

y(:,t)       =A*x(:,t-l)  +B*u(:,t-1); 

%  State  vector 

x(l,t)       =  y(l,t)  +  e*(l-exp(y(2,t) ) ) ; 

x(2,t)      =  y(2,t)  +  e*(l-exp(y(l,t) ) ) ; 

%  Error  vector 

ex(:,t)      =  x( : , t) -xhat ( : , t) ; 

%  Training  of  linkweight s  for  BNNM 

[al,a2]  =  bpm(al,a2, [x( : , t) ;u( : , t) ;Bias] ,ex( : ,  t) , Learnm) ; 

%  Linear  control  input 

ubar(:,t)    =  -K*x(:,t) ; 

%  BNNC  output 

udel(:,t)    =  netm( [x(: , t) ;  Bias] ,  bl,  b2) ; 
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%  Control  input 

u(:,t)       =  ubar(:,t)  +  udel(:,t); 

%  Linear  state  vector  estimate 

xbar (:, t+1)  =  Ae*x(:,t)  +  Be*u(:,t); 

%  BNNM  output 

xdel(:,t+l)  =  netm( [x( : , t) ;  u(:,t);  Bias],  al,  a2) ; 

%  State  vector  estimate 

xhat (:, t+1)  ■  xbar(:/t+l)  +  xdel  ( : ,  t+1)  ; 

%  Training  of  linkweights  for  BNNC 

[bl,b2]    =  bpc(al,a2,bl,b2,x(: ,t) ,u(: , t) /xhat (: ,t+l) 

Bias, P, Be, R, Learn) ; 
end; 
% 
% 

%   Resetting  initial  values 
x    =  zeros (na,Nt) ; 
y    =  zeros (na,Nt) ; 
xbar  =  zeros (na,Nt+l) ; 
xhat  =  zeros (na,Nt+l) ; 
xdel  =  zeros (na,Nt+l) ; 
ex   =  zeros (na,Nt) 
ubar  =  zeros  (nti^Nt) 
udel  »  zeros  (mb^t) 
u    -  [1  zeros (mb,Nt-l) ] ;   %  Impulse  input  at  time  0 
% 
% 

V   Simulation  run 
for  t  =  2:Nt 

y(:,t)       =A*x(:/t-l)  +B*u(:,t-1); 

%  State  vector 

x(l#t)      =  y(l,t)  +  e*(l-exp(y(2,t) )) ; 

x(2,t)      =  y(2,t)  +  e*(l-exp(y(l,t))) ; 

%  Error  vector 

ex(:ft)      =  x( : , t) -xhat ( : , t) ; 

%  Training  of  linkweights  for  BNNM 

[al,a2]  =  bpm(al,a2, [x( : , t) ;u( : , t) ;Bias] ,ex( : , t) , Learnm) 

%  Linear  control  input 

ubar ( : , t)    =  -K*x( : , t) ; 

%  BNNC  output 

udel(:,t)    =  netm( [x( : , t) ;  Bias],  bl,  b2) ; 

%  Control  input 

u(:,t)       =  ubar(:,t)  +  udel(:,t); 

%  Linear  state  vector  estimate 

xbar (:, t+1)  =  Ae*x(:,t)  +  Be*u(:,t); 

%  BNNM  output 

xdel (:, t+1)  =  netm( [x( : , t) ;  u(:,t);  Bias],  al,  a2) ; 

%  State  vector  estimate 

xhat (:, t+1)  =  xbar (:, t+1)  +  xdel (:, t+1); 

%  Training  of  linkweights  for  BNNC 

[bl,b2]     =  bpc(al,a2,bl,b2,x( : , t) ,u( : , t) ,xhat ( : , t+1) , . 

Bias, P, Be, R, Learn) ; 
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end; 

% 

% 

% 

Performance 

calculation 

Wl  = 

0; 

W2  - 

0; 

for 

t  -  1; 

:Nt 

Wl 

-  Wl 

+  abs  ( 

x(l, 

,t)); 

W2 

=  W2 

+  abs  ( 

x(2, 

t))j 

end; 

W  =  Wl  +  W2; 

% 

% 

%   Plot  of  state  vector  response  to  unit  impulse 

t=0:dt:time; 

plot (t,x(l, :) ,t,x(2, :)) /grid; 

text (0.8, 0.8/num2str (W) , 'sc' ) ; 

title ('Linear  Quadratic  Regulator'); 

xlabel ( 'Time  (sec)'); 

ylabel ( ' System  Output ' ) ; 
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The  following  programs  are  the  subroutines  required  to  run 
LQR.M  and  NLREG.M. 


function  [al,a2]  =  netinitm(ny,M,nu) 

% 

%   Routine  for  initializing  the  neural  net  with 

%   small  random  numbers. 

% 

%        Function  call:       [al,a2]    =  netinitm(ny,M,nu) 

% 

%   where  ny  =  number  of  outputs 

%         M  =  number  of  nodes  in  the  hidden  layer 

%        nu  ■  number  of  inputs 

% 

%   LT  Kurt  Menke,  10  June  1992 

% 

rand ( ' normal ' ) 

rand ('seed' , 0) 

al  -  0.1*rand(M,nu) ; 

a2  -  0.1*rand(ny,M) ; 

return 


function  x3  -  netm(xl,al,a2) 
% 

%   Routine  for  calculating  the  output  of  a  neural  net. 

% 

%   Function  call:   x3  -  net (xl,al,a2) 

% 

%   where  xl  =  neural  net  input  vector 

%         al  -  linkweight  matrix  from  input  to  hidden  layer 

%         a2  -  linkweight  matrix  from  hidden  layer  to  output 

% 

%   LT  Kurt  Menke,  10  June  1992 

% 

x2  =  sigmoid (al*xl) ; 

x3  =  a2*x2; 

return 
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function  y  =  sigmoid (x) 

% 

%   Routine  for  calculating  the  sigmoid  of  a  vector  input 

% 

%   Function  call:   y  =  sigmoid (x) 

% 

%   where  x  =  vector  input 

%         y  =  vector  output 

% 

%   LT  Kurt  Menke,  10  June  1992 

% 

y  =  1  ./  (l+exp(-x) ) ; 

return 


function  y  -  dsig(x) 

% 

%   Routine  for  calculating  the  first  derivative  of  the 

%       sigmoid  of  a  vector  input 

% 

%   Function  call:   y  ■  dsig(x) 

% 

%   where  x  -  vector  input 

%        y  »  vector  output 

% 

%   LT  Kurt  Menke,  10  June  1992 

% 

temp  =  exp(-x) ; 

y  ■  temp./  (1  +  2*temp  +  temp.*temp); 

return; 
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function  [al,a2]  =  bpm(al,a2,xl/ex,mu) 
% 

%   Routine  for  updating  the  linkweight  matrices  for  BNNM. 
% 

%   Function  call:   [al,a2]  =  bpm(al,a2,xl, ex,mu) 
.  % 
%   where  al  =  linkweight  matrix  from  input  to  hidden  layer 
%         a2  -  linkweight  matrix  from  hidden  layer  to  output 
%        xl  =  neural  net  input  vector 
%        ex  -  error  vector  for  linkweight  adjustment 
%        mu  »  learning  rate  for  BNNM 
% 

%   LT  Kurt  Menke,  10  June  1992 
% 

z2  ■  al*xl; 
x2  =  sigmoid (z2); 
z3  -  a2*x2; 
% 

parE_al  ■  -diag(dsig (z2) ) *a2'*ex*xl' ; 
parE_a2  -  - ex*x2 ' ; 
% 

al  -  al  -  mu . *parE_al ; 
a2  =  a2  -  mu.*parE_a2; 
return 
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function  [bl,b2]  =  bpc (al, a2,bl,b2,x, u,xhat,Bias, P,B,R,mu) 

% 

%   Routine  for  updating  the  linkweight  matrices  for  BNNC. 

% 

%        Function  call:       [bl,b2]    =  bpm(al,a2 ,bl,b2,x,u,xhat , . . . 

%  Bias, P,B,R,mu) 

% 

%   where  al  -  BNNM  linkweight s  from  input  to  hidden  layer 

%        a2  ■  BNNM  linkweights  from  hidden  layer  to  output 

%         x  =  neural  net  input  vector 

%         u  »  control  input 

%       xhat  ■  state  vector  estimate 

%       Bias  =  bias  value  for  neural  net 

%  P  =  matrix  solution  to  algebraic  Riccati  equation 

%  B  =  system  input  matrix 

%  R  =  weighting  factor 

%         mu  -  learning  rate  for  BNNC 

% 

%   LT  Kurt  Menke,  10  June  1992 

% 

xla  =  [x;  u;  Bias] ; 

z2a  -  al*xla; 

x2a  ■  sigmoid (z2a) ; 

z3a  ■  a2*x2a; 

% 

xlb  ■  [x;  Bias] ; 

z2b  »  bl*xlb; 

x2b  =  sigmoid (z2b) ; 

z3b  =  b2*x2b; 

% 

parh_b2  =  x2b' ; 

parh_bl  =  diag (dsig (z2b) ) *b2' *xlb' ; 

% 

della  =  a2*diag(dsig(z2a) ) ; 

parg_h  ■  della* (sum(al' ) ) ' ; 

parE_c  =  xhat' *P* (B+parg_h)  +  u'*R; 

% 

deb2  =  parE_c . *parh_b2 ; 

debl  -  parE_c.*parh_bl; 

% 

bl  =  bl  -  mu.*debl; 

b2  =  b2  -  mu.*deb2; 

return 
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